|
Is there a way to add scripts in tests/ to demonstrate that this new code is working? |
|
|
Planner 0/1 are unaffected: the TP works in Cartesian space and the RT module's kinematicsInverse/kinematicsForward symbols are resolved at load time as usual, so any user-written kinematics module works fine. Planner 2 has a problem: it needs a userspace reimplementation of each kinematics module's math (for Jacobian computation, joint-space limit enforcement, path sampling). Currently kinematics_params.h enumerates known modules, and kinematicsUserInit() hard-fails for anything not in the list, aborting trajectory init entirely. Three options to preserve compatibility with custom kinematics in Planner 2: (1) Fall back to identity kins in userspace: treat unknown modules as trivkins for the userspace layer. RT still uses the real module. Joint limit enforcement would be approximate but conservative. (2) Downgrade to planner 0/1: if userspace kins init fails for an unknown module, automatically fall back to planner 1 (or 0) with a warning. Simplest fix, preserves "any kins works with any TP". (3) Generic RT-userspace bridge: add a KINS_TYPE_GENERIC that calls the RT module's forward/inverse via shared memory. Correct but slower, and requires a new communication channel. Which approach would you prefer? |
The new code is not really done yet, I'll work on tests after things are stable enough, G64 is still not implemented, rigid tapping is missing, adaptive feed not tested, a few more things... |
|
At the moment I'm still hardening the feed override system. It is quite complex and I have not yet squashed all the ways things could go wrong, but it is getting closer each day... |
|
Preserving the generic pluggable nature of kinematics across trajectory planners is a very nice feature and should be preserved if possible. |
Good point. The math is actually already shared, each kinematics module has a *_math.h header (e.g. 5axiskins_math.h, trtfuncs_math.h) with pure static inline forward/inverse functions and no RT dependencies. Both the RT module and the userspace lib call into these same headers. What differs is the glue code around the math. The RT side creates HAL pins with hal_pin_float_newf() and reads them by direct pointer dereference, while userspace walks HAL shmem by pin name string through hal_pin_reader. Init is hal_malloc() + switchkinsSetup() + EXPORT_SYMBOL() on the RT side vs calloc() + function pointer dispatch in userspace. Logging is rtapi_print() vs fprintf. Trying to unify these into a single .c would mean heavy #ifdef RTAPI scaffolding around everything except the math, which is already shared. The real obstacle for custom kinematics in planner 2 isn't math duplication, it's that kinematics_user.c needs to know the module exists at compile time (enum entry, function pointers, HAL pin names for refresh()). A user-written RT module has no matching userspace entry. A possible path: let custom modules optionally ship a mykins_userspace.so implementing a standard kins_userspace_init() API, which the planner dlopen()s at runtime. That would make planner 2 pluggable the same way RT kins already are, without enumerating every module. If that's too heavy, falling back to planner 1 for unknown modules is the simplest safe option. If you are able to conjure up a method to make this work, I'd be glad to implement it. |
That is the real problem. You have static enumeration instead of a dynamic pluggable system. Why do you need enumeration? Your interface into the kinematics should be generic. The real question is: when the math is shared, what glue is required for userspace to be usable with the new planner, and what is the glue code for realtime? The glue code should be the same for each and every kinematics for the interface. Therefore, you only need to devise a way to compile the kinematics modules so they give you two resulting loadable modules, one for realtime and one for userspace.
That is how I think it is supposed to be, yes. Just like the realtime kinematics. You probably only need to be able to load and not to unload modules (the kinematics is set in a configuration file and cannot be changed at run-time).
Have a look at A similar strategy will also work for userspace kinematics. |
|
Would it be possible to pass an ID value to the kinematics modules and use that instead of the enum? Then users could configure / customize it. |
|
Deep into hardening feed override system, I'll have a better look at the kins probably tomorrow, thank you for the input, I'll try my best to make it work, I'm sure there is a possible approach. |
|
The last PR addresses all jerk spikes I was able to find. My testing method was to run G-code with small segments at a high feed rate while a script swung the feed override wildly. The feed override hand-off branching system is now basically bulletproof as far as I've tested, and I have tested it a lot. |
|
Implemented the dlopen plugin approach. Here's what changed: The kinematics_type_id_t enum and map_kinsname_to_type_id() are gone entirely. No IDs, no dispatch switch. The module name string is the identity: it maps directly to <name>_userspace.so. Each kinematics module now ships a small plugin (50-150 lines) that exports one symbol: kins_userspace_setup(). The loader does dlopen(EMC2_HOME "/lib/kinematics/" name "_userspace.so"), calls setup, and the plugin sets its forward/inverse/refresh function pointers. Built-in and custom modules are loaded identically. The 17 built-in kinematics were extracted into self-contained .c files under plugins/. They reuse the existing *_math.h headers (pure math, no HAL deps), the same shared code that RT uses. The glue is minimal: read params from ctx->params, call the math function, done. If planner 2 is requested but the plugin .so doesn't exist (custom kins without a userspace plugin), it warns and falls back to planner 0 instead of aborting. Custom kins still work fine on planners 0/1 as before; they just won't get planner 2 until they add a _userspace.so. kinematics_user.c went from ~1500 lines to ~280. The shared memory struct changed (removed type_id field). |
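A minimal sketch of the load-or-fall-back logic described above, not the code from the PR. The path layout and the symbol name kins_userspace_setup come from the comment; the setup signature (`int (*)(void *ctx)`), the helper names, and the 512-byte path buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical signature: the thread only names the symbol, not its prototype. */
typedef int (*kins_setup_fn)(void *ctx);

/* Build "<home>/lib/kinematics/<name>_userspace.so".
 * Returns 0 on success, -1 if the path did not fit in buf. */
int kins_plugin_path(char *buf, size_t size, const char *home, const char *name)
{
    assert(buf != NULL && home != NULL && name != NULL);
    int n = snprintf(buf, size, "%s/lib/kinematics/%s_userspace.so", home, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Return the planner to run: the requested one if it needs no plugin or the
 * plugin loads and sets up cleanly, otherwise planner 0 with a warning. */
int kins_select_planner(const char *home, const char *name, int requested, void *ctx)
{
    char path[512];

    if (requested != 2)
        return requested;               /* planners 0/1 need no userspace plugin */
    if (kins_plugin_path(path, sizeof path, home, name) != 0)
        return 0;

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "warning: %s; falling back to planner 0\n", dlerror());
        return 0;
    }

    /* POSIX permits converting the dlsym() result to a function pointer. */
    kins_setup_fn setup = (kins_setup_fn)dlsym(handle, "kins_userspace_setup");
    if (setup == NULL || setup(ctx) != 0) {
        dlclose(handle);
        return 0;
    }
    return 2;   /* handle is deliberately kept open: kinematics never unload */
}
```

The handle is never closed on success because, as noted earlier in the thread, the kinematics is fixed by the configuration file and only needs to be loaded, not unloaded.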
|
Good to hear you implemented dlopen(). But I'm still not sure why you moved the actual forward/reverse kinematics calculation into a *_math.h header file. That seems to defeat the one source file and two glues. Header files are usually a very bad place for code. Header files are there as an interface layer. Sure, using "static inline" qualifiers makes them local, but that is, IMO, a very bad habit. What I had expected was:
Or is the *_math.h header a remnant from the previous code iteration? |
|
In C, code in header files is indeed not a common practice. If we don't want to abandon RTAI just yet, I think it is usually not possible to link one object file to a kernel module and to a normal program. IMO #include-ing the code is not a bad idea in this case; it also keeps the build system out of the loop. The sources to be included could be renamed to a different ending like |
Creating a .ko can be done from multiple .o objects, just like creating an .so can be created from multiple .o objects. There is no difference afaics. Only the userspace/kernelspace interface/glue layer is different, which is done in rtapi. I only propose to add some glue to differentiate between linking RT and non-RT kinematics modules. I don't think the non-RT kinematics can run in kernel space. @grandixximo must pitch in here to make that assessment whether the non-RT kinematics could ever be a kernel module.
I don't think we should be using this type of code inclusion at all. And for RTAI, it seems that development has stopped completely. I'm not sure it is worth the effort to keep it in very much longer. There is already a lot that does not work with RTAI anyway. |
|
I don't think they can be a kernel module, because they run on a userspace thread, but I'm no expert, and have not really explored this deeply yet. |
|
I have no problem if kernel-mode stuff is abandoned, but then it should be stated, and all that C / C++ schism could be resolved / isn't needed in new code, so the whole split would be pointless. If we want to keep kernel support for now, linking one object into userspace and kernel objects is of course possible in theory, but it is asking for trouble. Math stuff is handled differently, for one: you can't just include <math.h> in kernel code, and rtapi_math.h has conditional compilation depending on __KERNEL__ or not. It may work to link stuff compiled against the "wrong" prototypes, but that is a hack at best. There may be other problems like LTO, autovectorization, calling conventions, frame pointer, and in the worst case it would break on "wrong" kernel configs. I don't think it's worth it just to get rid of a |
|
I had a better look at this. I considered the single .o approach, but the *_math.h pattern has advantages. For example, in BUILD_SYS=normal (kernel), RT objects are compiled with -nostdinc and kernel includes, so the same .o can't serve both contexts. The math headers work for both build systems with zero #ifdef. They could be renamed to .inc if the .h extension bothers you, but static inline functions in headers are the same pattern the Linux kernel uses extensively (list.h, rbtree.h, etc.), so unless the kernel also has bad habits, I think it's fine. |
|
I might be wrong about this, but I explored this for a while before settling on the shared header approach. The alternative would be splitting each *_math.h into a .h (prototypes) and .c (implementation), then compiling the .c twice with different flags and updating the link rules for every module. It's doable but adds significant Makefile complexity for the same result. If you'd prefer that approach I can implement it. |
|
If I understand it correctly, the loadable kinematics module is not going into the same process space for RT (rtai_app) and non-RT (milltask?). When your new TP cannot load into the kernel (RTAI) then we do not need to consider that option too seriously, just enough to bypass in compilation. In the case of uspace, why can't the same kinematics .so be loaded into two different processes and perform their specific function in the process' context? The motion controller links directly into the kinematics{Forward,Reverse} functions, which means that the kinematics .so must be loaded before the controller's .so to satisfy the dynamic linking process. If you also export appropriate functions for your non-RT process hook, then you could, in principle, load the same .so in both processes and have it perform the kinematics there too. Or am I missing something here? |
The RT .so (e.g., maxkins.so) does hal_init() + hal_pin_new() in rtapi_app_main(), and kinematicsForward() reads params directly from HAL pin pointers. Loading the same .so in a second process would either conflict on hal_init() or need runtime detection to skip it and read parameters differently. The separate userspace plugin avoids that, it reads HAL pin values through a read-only interface without registering as a HAL component. |
|
Afaik, only when you run But, you don't need to call rtapi_app_main() at all when you yourself do the dlopen(). A call to dlopen() will do nothing more than resolve the dynamic link dependencies. Adding RTLD_LOCAL will prevent exporting any symbols from the loaded .so, and the only way to get to them is to use dlsym(). You don't even need to worry or care about the kinematicsForward and kinematicsReverse symbols (functions). You can simply split the mathematics inside the kinematics source and implement and export, let's say, as an example, You can also prevent your functions from being exported in a kernel build simply by placing the definition and EXPORT_SYMBOL() invocations in a #ifndef __KERNEL__ conditional. More should not be required. |
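The single-source layout suggested here could look roughly like the sketch below. Everything named mykins_* or nonrt_mykins_* is hypothetical, the pose type is a stand-in for the real one, and the EXPORT_SYMBOL fallback exists only so the fragment compiles by itself; a real module would get that macro from RTAPI. The math is a trivial identity mapping chosen purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in so this sketch builds alone; real builds take EXPORT_SYMBOL from RTAPI. */
#ifndef EXPORT_SYMBOL
#define EXPORT_SYMBOL(sym) extern int sym##_export_marker
#endif

/* Hypothetical pose type, standing in for the real Cartesian pose struct. */
typedef struct { double x, y, z; } pose3_t;

/* The shared math lives once, in the module's own .c file. */
static void mykins_forward_math(const double *joints, pose3_t *pos)
{
    pos->x = joints[0];
    pos->y = joints[1];
    pos->z = joints[2];
}

/* RT-side entry point: in a real module this would read its parameters
 * from HAL pin pointers before calling the math. */
int mykins_rt_forward(const double *joints, pose3_t *pos)
{
    assert(joints != NULL && pos != NULL);
    mykins_forward_math(joints, pos);
    return 0;
}
EXPORT_SYMBOL(mykins_rt_forward);

#ifndef __KERNEL__
/* Non-RT entry point: compiled and exported only outside kernel builds,
 * and looked up with dlsym() by the userspace planner. */
int nonrt_mykins_forward(const double *joints, pose3_t *pos)
{
    assert(joints != NULL && pos != NULL);
    mykins_forward_math(joints, pos);
    return 0;
}
EXPORT_SYMBOL(nonrt_mykins_forward);
#endif
```

Both entry points call the same static function, so there is one copy of the math and the RT/non-RT difference is confined to the thin wrappers.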
|
You're right that dlopen() alone won't call rtapi_app_main(), confirmed. Both approaches work, so here's a comparison from a maintenance perspective: Current approach (math headers + separate plugins): math extracted into *_math.h; RT modules and userspace plugins both include it. Alternative approach: math stays in the .c file; nonrt_* functions exported alongside RT functions. |
That is completely optional. You are allowed to export the non-RT functions and they will simply go unused and fill a marginal amount of space. No problem with that. As long as there are no dyn-link refs, but that is a naming question. You only need to make sure that it links, which could mean the requirement of a few stubs. Although, the code can be designed that no or only few stubs are required.
That I see as an advantage because the actual kinematics is in one file. You can reuse code more effectively. You do have to choose carefully what the interface does. You do not want to replicate code from higher layers in the modules.
That is a general issue in all of the components already because of the kernel/userspace boundary. The RT/non-RT boundary is easier to handle. Just make your code run as RT, then it should also run as non-RT. I can't imagine that your use of the kinematics calculations changes its actual behaviour in any meaningful way. Or does it? If not, then it should be a moot issue. The biggest advantage here is that the changeset should be easier to understand and people with their own kinematics component can add/change their code to work with the new way a bit easier. I guess a "how to migrate kinematics components" document would be required in any circumstance. |
|
Kernel modules are pretty restricted in what they can do, whereas in the userspace realtime thread nearly everything is allowed, including C++, exceptions, etc... Things with non-constant upper limit of runtime should be avoided though, and code should only access data and code that is locked and can't be evicted or paged out. Shared memory segment and stack is OK, dynamic memory probably not. The rtapi glue code memlocks all code, but probably not stuff that you dlopen somewhere in a module, so that should be checked. |
|
Agreed, I will go ahead with refactoring, thank you for the guidance.
The dlopen() of the RT .so happens in milltask (non-RT), not in the servo thread. |
|
Refactored the kinematics; much cleaner approach. Thank you @BsAtHome for the guidance. |
COMPLETED
=========

Architecture:
* Dual-layer: userspace planning, RT execution
* Lock-free SPSC queue with atomic operations
* 9D vector abstractions (lines work, arcs TODO)
* Backward velocity pass optimizer
* Peak smoothing algorithm
* Atomic state sharing between layers

Critical Fixes:
* Optimizer now updates `tc->finalvel` (prevents velocity discontinuities)
* Force exact stop mode (`TC_TERM_COND_STOP`) - no blending yet
* RT loop calls `tpRunCycle()` every cycle (fixes 92% done bug)
* Error handling uses proper `rtapi_print_msg()` instead of `printf()`

Verified Working:
* Simple linear G-code completes (squares, rectangles)
* Acceleration stays within INI limits during normal motion
* No blend spikes (fixed)

KNOWN LIMITATIONS
=================

E-stop: 3x acceleration spike
- Tormach has identical behavior (checked their code)
- Industry standard for emergency stops
- Safety requirement: immediate response
- Acceptable for Phase 0

No Blending: Exact stop at every corner
- Expected - Phase 4 feature
- Prevents acceleration spikes without blend geometry

No Arcs: G2/G3 not implemented
- Not needed for Phase 0 validation
- `tpAddCircle_9D()` stub exists

Feed Override: Abrupt changes
- Predictive handoff needed (Phase 3)
- Works, just not smooth

FUTURE PHASES
=============

Phase 1: Kinematics in userspace
Phase 2: Ruckig S-curve integration
Phase 3: Predictive handoff, time-based buffering
Phase 4: Bezier blend geometry
Phase 5: Hardening, edge cases
Phase 6: Cleanup

FILES MODIFIED
==============

Core Planning:
src/emc/motion_planning/motion_planning_9d.cc - Optimizer
src/emc/motion_planning/motion_planning_9d_userspace.cc - Segment queueing
src/emc/motion_planning/motion_planning_9d.hh - Interface

RT Layer:
src/emc/motion/control.c - RT control loop fix
src/emc/motion/command.c - Mode transitions
src/emc/tp/tp.c - Apply optimized velocities
src/emc/tp/tcq.c - Lock-free queue operations

Infrastructure:
src/emc/motion/atomic_9d.h - SPSC atomics
src/emc/tp/tc_9d.c/h - 9D vector math
src/emc/tp/tc_types.h - Shared data structures

TEST
====

G21 G90 F1000
G1 X10 Y0
G1 X10 Y10
G1 X0 Y10
G1 X0 Y0
M2

Expected: Jerky motion (exact stop), completes without errors.
Extract kinematics math into HAL-independent headers and create userspace kinematics infrastructure for trajectory planning.

Key changes:
- Extract pure math functions from all kinematics modules into *_math.h headers (5axiskins, corexykins, genhexkins, genserkins, lineardeltakins, maxkins, pentakins, pumakins, rosekins, rotarydeltakins, rotatekins, scarakins, scorbotkins, tripodkins, trtfuncs)
- Add kinematics_userspace/ with C interface for userspace kinematics
- Add HAL pin reader for accessing kinematics parameters from userspace
- Add motion_planning modules: Jacobian calculation, joint limits enforcement, path sampling, and userspace kinematics integration
- Add kinematics_params.h for shared parameter definitions
- Update trajectory planner (tp.c) to support userspace kinematics path
- Update INI parsing to load userspace kinematics configuration

This enables the 9D planner to compute inverse kinematics and perform singularity-aware velocity limiting without RT kernel calls.
Phase 2 (Ruckig Integration) was merged from master. This phase adds:
- Feed override handling with proper velocity/acceleration management
- 9-phase profile optimization for trajectory smoothing
- Downstream exit velocity capping for segment boundaries
- Tangent-aware Jacobian limits for per-axis joint constraints
- Backward pass optimization for profile consistency
- Jerk-aware motion planning refinements
recomputeDownstreamProfiles backward fixup used end_idx = start_index + recomputed - 1, treating recomputed as a contiguous span from start_index. With force_full runs (resume merge), the forward loop processes the entire queue with SKIP/RECP interspersed, so recomputed is a count of Ruckig solves scattered across the queue, not a contiguous span. The backward fixup window was too small and missed inter-segment violations (e.g. seg162 exit 191 mm/s vs blend entry cap 167.792 mm/s at queue index 25-26 when recomputed=20 gave end_idx=20).

Fix: for force_full runs use queue_len-1 as end_idx so backward fixup covers the full processed range. Also write pin_src=5 (DS-pin) on segments capped by backward fixup in the DS context, so Phase 2 does not re-raise their exits in the next tick.

Supporting changes included:
- force_full parameter bypasses time/count budget so merge DS always processes the whole queue
- g_merge_recomputed_this_tick suppresses ensureProfilesOnLowBuffer in the same tick to prevent Phase 2 clobbering DS ramp-up profiles
- DS-pin guard in computeRuckigProfiles_9D propagates src=5 exits
- main profile exit synced to branch exit at merge to prevent stale seeding
- stop profiles protected from optimizer overwrite during pause/abort
- brake max_accel constrained to sqrt(2*sv*jerk) to avoid acceleration step spikes at spill-over junctions
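The window calculation described in this commit can be sketched as a small helper. The function name and parameter list are hypothetical; only the two formulas (start_index + recomputed - 1 and queue_len - 1) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Last queue index the backward fixup should visit.
 * In a normal run the recomputed segments form a contiguous span starting at
 * start_index. In a force_full run they are scattered across the whole queue,
 * so the window must extend to the last queued segment. */
int backward_fixup_end_idx(int start_index, int recomputed, int queue_len, bool force_full)
{
    assert(start_index >= 0 && recomputed >= 0 && queue_len > 0);
    if (force_full)
        return queue_len - 1;
    return start_index + recomputed - 1;
}
```

With the numbers quoted in the commit, a span-based window that ends at index 20 never reaches the violation at queue index 25-26, whereas the force_full branch covers it for any queue longer than 26 segments.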
Restore trapezoidal braking for v_f≈0 in the backward pass (fast velocity recovery) and add a 1-segment forward-lookahead reachability cap in replanForward to prevent junction spikes. The backward pass uses trapezoidal sqrt(2·a·d) for segments exiting at ~0 (STOP/tail), which overestimates reachable entry velocity vs the forward pass's jerk-limited check. When a kink/maxvel constraint sits between the jerk-limited and trapezoidal answers, the committed profile exits higher than the next segment can enter — causing a junction spike. The forward-lookahead caps scaled_v_exit by jerkLimitedMaxEntryVelocity of the next segment before the SKIP check, so the profile is committed with a reachable exit. This restores the role of the old canStopChainHere() that was removed in 12dc4d6, integrated directly into the forward pass.
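A rough sketch of the forward-lookahead cap, assuming the jerk-limited maximum entry velocity of the next segment has already been computed elsewhere (the commit names jerkLimitedMaxEntryVelocity for that; it is not reimplemented here). Both function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True when a committed exit velocity is higher than the next segment can
 * accept, which is the junction-spike condition the commit describes. */
bool junction_would_spike(double committed_exit, double next_max_entry)
{
    return committed_exit > next_max_entry;
}

/* Cap the feed-scaled exit velocity by the next segment's jerk-limited
 * maximum entry velocity, so the profile is committed with a reachable exit. */
double cap_exit_to_next_entry(double scaled_v_exit, double next_max_entry)
{
    assert(scaled_v_exit >= 0.0 && next_max_entry >= 0.0);
    return scaled_v_exit < next_max_entry ? scaled_v_exit : next_max_entry;
}
```

Applying the cap before the SKIP check matters because a skipped segment keeps whatever exit it was last committed with; capping afterwards would leave the unreachable value in place.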
Segment compressor retained stale buffered segments across abort. After RT cleanup cleared the queue, the post-abort SYNCH/term-cond reset triggered tpFlushCompressor_9D, pushing the stale segment into the empty queue. RT then executed it, moving the machine to a readahead position far from where it stopped. Reset the compressor at the start of emcTrajAbort, before tpRequestAbortBranch_9D, so no stale buffer survives to be flushed.
Two fmin() calls prevented final_vel from ever increasing after a feed reduction (e.g. 200%→100%), causing 3.9s of velocity oscillation at segment junctions. Backward pass chain: min(chained, stored) capped freshly computed values by the previous pass's stale result. Apply function: fmin(v_new, finalvel) re-capped the backward pass output by the old finalvel every cycle. Both stored values originate from the same backward pass, so the min was self-reinforcing — once set at a higher feed, the limit could never rise.
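The ratchet effect can be reproduced in a few lines. This is an illustration only: the function names are hypothetical, and the "fixed" variant assumes the fix is simply to take the freshly computed value, which the commit message implies but does not spell out.

```c
#include <assert.h>

/* Old behaviour: the fresh backward-pass result is capped by the stored one.
 * Because the stored value came from an earlier pass, the limit can only
 * ratchet downward and never recovers. */
double apply_final_vel_old(double v_new, double stored_finalvel)
{
    return v_new < stored_finalvel ? v_new : stored_finalvel;
}

/* Assumed fix: the fresh result replaces the stored value outright. */
double apply_final_vel_fixed(double v_new, double stored_finalvel)
{
    assert(v_new >= 0.0);
    (void)stored_finalvel;      /* deliberately ignored */
    return v_new;
}
```

Feeding the old function a sequence of results such as 80, 50, 80 leaves the stored limit stuck at 50, while the fixed function returns to 80 as soon as the backward pass allows it.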
The backward pass (computeLimitingVelocities_9D) was capped at MAX_LOOKAHEAD_DEPTH=200 segments from the queue tail. After a feed increase (e.g. 100%→200%), segments near the active end had stale final_vel values that produced junction velocity gaps when scaled by the new feed. The backward pass couldn't reach them to propagate updated constraints — it only covered the 200 newest segments. The backward pass is pure arithmetic (~100ns/segment, no Ruckig calls), so covering the full queue costs ~50μs at 500 segments. The depth limit was premature optimization that created a correctness hole.
Replace 8ms debounce gate with PlanningHorizon, a convergence-aware gate that only permits new branches when the system has enough converged queue to survive another branch cycle. safe_depth is derived from measured EMA costs (branch, backward, ruckig per-segment) and scanned segment durations, with no magic numbers.

Two-layer spike prevention:
- Gate prevents back-to-back branches (coalesces rapid slider changes)
- Gated snapshot returns committed_feed when gate closed, ensuring backward+forward passes always use the same feed within a replan

Three feed entry points, all intentional:
1. evaluate() reads live HAL to detect changes
2. computeBranch() receives target_feed for transition profile
3. snapshot() feeds replanForward (sole source for bulk computation)

committed_feed ties them together: frozen at branch time, returned by snapshot when gate closed, confirmed at merge.

STALE_PROFILE (step 5) now uses current_feed from evaluate path instead of RT's canonical_feed_scale, which eliminates the independent feed entry point that RT wrote from live HAL (tp.c:3947,4972).

replanForward returns int (segments visited) for gate tracking. Budget boosted to 90% when gate closed (vs 50% steady state). Gate opens from continuation replan, never same cycle as branch, guaranteeing at least one cycle of coalescing.
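A simplified sketch of the gated snapshot, assuming a two-field state. The struct and function names are hypothetical, and returning the live feed while the gate is open is an assumption; the commit only states what happens when the gate is closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical gate state: whether new branches are permitted, and the feed
 * that was frozen when the last branch was taken. */
typedef struct {
    bool   gate_open;
    double committed_feed;
} feed_gate_t;

/* Taking a branch freezes the target feed and closes the gate. */
void feed_gate_branch(feed_gate_t *g, double target_feed)
{
    assert(g != NULL);
    g->committed_feed = target_feed;
    g->gate_open = false;
}

/* Feed used for bulk replanning: the frozen value while the gate is closed,
 * so backward and forward passes agree within one replan. */
double feed_gate_snapshot(const feed_gate_t *g, double live_feed)
{
    assert(g != NULL);
    return g->gate_open ? live_feed : g->committed_feed;
}
```

While the gate is closed, any number of slider changes collapse into the single committed value, which is the coalescing behaviour the commit describes.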
Replace the old profile-valid + queue-depth gates with a single FINALIZED gate that checks optimization_state >= TC_PLAN_FINALIZED before allowing RT to activate any segment.

- Forward pass only stamps FINALIZED when exit boundary conditions are known: EXACT segments (vf=0 always correct), segments with a successor in queue, or tail segments when queue is sealed.
- SKIP path re-stamps FINALIZED after backward pass knocks state back to SMOOTHED.
- queue_sealed flag in TP_STRUCT: set by tpFlushCompressor_9D at sync points (dwell, mode change, program end), cleared by tpAddLine_9D/tpAddCircle_9D when new motion arrives. Lets the optimizer finalize the tail segment immediately instead of waiting for a successor that will never come.
- 200ms safety-net timeout for cases not covered by the seal (first segment after tool change, program start).
- Cleanup: removed debug probes (GATE_DBG, ACTIVATE_DBG, QUEUE_DBG, OPT_DBG, SEAL_DBG, XING_DBG, FWD_VF_DBG), stale active-segment rewrite, pessimistic first-profile hack.
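The finalization rule and the RT-side gate could be expressed as two predicates. The enum below is a hypothetical stand-in whose names and ordering are assumed (the commit mentions only the SMOOTHED and FINALIZED states and the >= comparison), and the 200ms timeout path is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the planner's optimization_state values. */
typedef enum {
    PLAN_UNTOUCHED = 0,
    PLAN_SMOOTHED  = 1,
    PLAN_FINALIZED = 2
} plan_state_t;

/* The forward pass may stamp FINALIZED only when the exit boundary is known:
 * exact-stop segments always exit at zero, a segment with a successor has a
 * defined junction, and a tail segment is safe once the queue is sealed. */
bool exit_boundary_known(bool is_exact_stop, bool has_successor, bool queue_sealed)
{
    return is_exact_stop || has_successor || queue_sealed;
}

/* Single gate on the RT side: only finalized segments may be activated. */
bool rt_may_activate(plan_state_t state)
{
    assert(state >= PLAN_UNTOUCHED && state <= PLAN_FINALIZED);
    return state >= PLAN_FINALIZED;
}
```

An unsealed tail segment that is not an exact stop fails the first predicate, which is the case the safety-net timeout in the commit is meant to cover.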